A Comparative Analysis Between EPIC Static Instruction Scheduling and DTSVLIW Dynamic Instruction Scheduling
Abstract
To achieve performance, Explicitly Parallel Instruction Computing (EPIC) systems move the responsibility for extracting instruction-level parallelism (ILP) from the hardware to the compiler, and they expose a large part of the hardware control at the conventional machine level. Dynamically Trace Scheduled VLIW (DTSVLIW) systems, on the other hand, leave the responsibility for extracting ILP to the hardware and use conventional compilers. Their hardware runs a simple scheduling algorithm, implemented in hardware and executed dynamically, to exploit ILP and achieve performance. This work examines three compiler/EPIC architecture combinations (SGI PRO64 C++ Compiler/IA64, Intel C++ Compiler 5.0.1/IA64 and Trimaran 2.0/HPL-PD) and compares them with a compiler/DTSVLIW architecture combination (gcc/Alpha-DTSVLIW). Our experiments show that, on average, the DTSVLIW architecture achieves better performance than EPIC because its dynamic scheduler, although much simpler, harnesses more ILP by exploiting execution-time information that is invisible to the EPIC compiler’s scheduler.
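To make the idea of scheduling a sequential instruction stream into long instruction words concrete, here is a minimal sketch in Python. It is not the DTSVLIW hardware algorithm from the paper; it is a generic greedy, in-order packer written for illustration. The names `Instr` and `schedule`, the register-only dependence model, and the assumption that all reads in a word happen before its writes are ours. Memory dependences, branches, and latencies are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction: one destination register, zero or more sources."""
    name: str
    dest: str
    srcs: tuple = ()


def schedule(trace, width):
    """Greedily pack a sequential trace into words of at most `width` instructions.

    Assumption: within one word, all register reads happen before any write,
    so an instruction may share a word with an earlier reader of its destination.
    """
    words = []        # words[i] is the list of instructions issued together
    last_write = {}   # register -> index of the word that last wrote it
    last_read = {}    # register -> index of the latest word that read it
    for ins in trace:
        earliest = 0
        for reg in ins.srcs:                  # read-after-write: follow the producer
            if reg in last_write:
                earliest = max(earliest, last_write[reg] + 1)
        if ins.dest in last_write:            # write-after-write: keep write order
            earliest = max(earliest, last_write[ins.dest] + 1)
        if ins.dest in last_read:             # write-after-read: not before the reader
            earliest = max(earliest, last_read[ins.dest])
        w = earliest
        while w < len(words) and len(words[w]) >= width:
            w += 1                            # skip words whose slots are full
        while len(words) <= w:
            words.append([])
        words[w].append(ins)
        last_write[ins.dest] = w
        for reg in ins.srcs:
            last_read[reg] = max(last_read.get(reg, -1), w)
    return words


# Illustrative trace (made up for this sketch, not taken from the paper).
trace = [
    Instr("i1", "r1"),
    Instr("i2", "r2"),
    Instr("i3", "r3", ("r1", "r2")),
    Instr("i4", "r4"),
    Instr("i5", "r5", ("r3", "r4")),
]
for index, word in enumerate(schedule(trace, width=2)):
    print(index, [ins.name for ins in word])
```

In this example the independent instruction `i4` is placed alongside `i3`, so five sequential instructions fit in three two-wide words. A dynamic scheduler of the kind the abstract describes works on the instruction stream actually executed, which is where the execution-time information mentioned above comes from; a static compiler would have to make the same decisions without observing that stream.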
Related papers
On the Scheduling Algorithm of the Dynamically Trace Scheduled VLIW Architecture
In a machine that follows the dynamically trace scheduled VLIW (DTSVLIW) architecture, VLIW instructions are built dynamically through an algorithm that can be implemented in hardware. These VLIW instructions are cached so that the machine can spend most of its time executing VLIW instructions without sacrificing any binary compatibility. This paper evaluates the effectiveness of the DTSVLIW in...
On the Effectiveness of the Scheduling Algorithm of the Dynamically Trace Scheduled VLIW Architecture
In a machine that follows the dynamically trace scheduled VLIW (DTSVLIW) architecture, VLIW instructions are built dynamically through a scheduling algorithm that can be implemented in hardware. These VLIW instructions are cached so that the machine can spend most of its time executing VLIW instructions without sacrificing any binary compatibility. This paper evaluates the effectiveness of the ...
DTSVLIW: VLIW Performance with Sequential Code
Due to the temporal execution locality present in programs, even small instruction caches (16-Kbyte) can provide processors with fast access to instructions most of the time. The Dynamically Trace Scheduled VLIW (DTSVLIW) architecture exploits programs’ temporal execution locality by executing code in two distinct modes. In the first execution encounter, fragments of the code are executed in ...
CSE231 project report - survey on instruction scheduling
This paper surveys past research on instruction scheduling for exploiting more Instruction Level Parallelism (ILP). We focus on static instruction scheduling performed by the compiler. The hardware platform for implementing such compiler techniques, i.e. VLIW, is also reviewed. We also compare the code scheduling done dynamically by out-of-order machines with that done by compilers, along ...
Single Instruction Fetch Does Not Inhibit Instruction-Level Parallelism
Superscalar machines fetch multiple scalar instructions per cycle from the instruction cache. However, machines that fetch no more than one instruction per cycle from the instruction cache, such as Dynamic Trace Scheduled VLIW (DTSVLIW) machines, have shown performance comparable to that of superscalar machines. In this paper, we present experiments that show that fetching a single instruction from th...